Skip to content

[rocky8_10] History Rebuild for kernel-4.18.0-553.40.1.el8_10#141

Merged
PlaidCat merged 18 commits into
rocky8_10from
rocky8_10_rebuild
Mar 5, 2025
Merged

[rocky8_10] History Rebuild for kernel-4.18.0-553.40.1.el8_10#141
PlaidCat merged 18 commits into
rocky8_10from
rocky8_10_rebuild

Conversation

@PlaidCat

Copy link
Copy Markdown
Collaborator

This is a Kernel Rebuild request, currently the code for this is internal to CIQ, however over the next quarter we do plan to make it publically available.

General Process:

Contains the following: http://download.rockylinux.org/pub/rocky/8.10/BaseOS/source/tree/Packages/k/

BUILD

[lost to TMUX]

[SNIP]

  INSTALL sound/xen/snd_xen_front.ko
  INSTALL virt/lib/irqbypass.ko
  DEPMOD  4.18.0-rocky8_10_rebuild
[TIMER]{MODULES}: 41s
Making Install
sh ./arch/x86/boot/install.sh 4.18.0-rocky8_10_rebuild arch/x86/boot/bzImage \
        System.map "/boot"
[TIMER]{INSTALL}: 22s
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-4.18.0-rocky8_10_rebuild and Index to 1
The default is /boot/loader/entries/7907d42970ea4fcdb46e60b2eda1ab89-4.18.0-rocky8_10_rebuild.conf with index 1 and kernel /boot/vmlinuz-4.18.0-rocky8_10_rebuild
The default is /boot/loader/entries/7907d42970ea4fcdb46e60b2eda1ab89-4.18.0-rocky8_10_rebuild.conf with index 1 and kernel /boot/vmlinuz-4.18.0-rocky8_10_rebuild
Generating grub configuration file ...
done
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 7s
[TIMER]{BUILD}: 1773s
[TIMER]{MODULES}: 41s
[TIMER]{INSTALL}: 22s
[TIMER]{TOTAL} 1847s
Rebooting in 10 seconds

Boot

[maple@r8-sigcloud-builder ~]$ uname -a
Linux r8-sigcloud-builder 4.18.0-rocky8_10_rebuild #1 SMP Wed Feb 19 20:03:36 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Basic 2 loop

Just checking for crashes

[maple@r8-sigcloud-builder code]$ ./run_kerselftests.sh 2
Starting Test Loop 1
Test Loop 1 Done
Starting Test Loop 2
Test Loop 2 Done

jira LE-2349
Rebuild_History Non-Buildable kernel-4.18.0-553.37.1.el8_10
commit-author Benjamin Coddington <bcodding@redhat.com>
commit 1cbc11a

Commmit f5ea161 ("NFSv4: Retry LOCK on OLD_STATEID during delegation
return") attempted to solve this problem by using nfs4's generic async error
handling, but introduced a regression where v4.0 lock recovery would hang.
The additional complexity introduced by overloading that error handling is
not necessary for this case.  This patch expects that commit to be
reverted.

The problem as originally explained in the above commit is:

    There's a small window where a LOCK sent during a delegation return can
    race with another OPEN on client, but the open stateid has not yet been
    updated.  In this case, the client doesn't handle the OLD_STATEID error
    from the server and will lose this lock, emitting:
    "NFS: nfs4_handle_delegation_recall_error: unhandled error -10024".

Fix this by using the old_stateid refresh helpers if the server replies
with OLD_STATEID.

	Suggested-by: Trond Myklebust <trondmy@hammerspace.com>
	Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
	Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
(cherry picked from commit 1cbc11a)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira LE-2349
Rebuild_History Non-Buildable kernel-4.18.0-553.37.1.el8_10
commit-author Martin K. Petersen <martin.petersen@oracle.com>
commit b5fc07a

Commit c92a6b5 ("scsi: core: Query VPD size before getting full
page") removed the logic which checks whether a VPD page is present on
the supported pages list before asking for the page itself. That was
done because SPC helpfully states "The Supported VPD Pages VPD page
list may or may not include all the VPD pages that are able to be
returned by the device server". Testing had revealed a few devices
that supported some of the 0xBn pages but didn't actually list them in
page 0.

Julian Sikorski bisected a problem with his drive resetting during
discovery to the commit above. As it turns out, this particular drive
firmware will crash if we attempt to fetch page 0xB9.

Various approaches were attempted to work around this. In the end,
reinstating the logic that consults VPD page 0 before fetching any
other page was the path of least resistance. A firmware update for the
devices which originally compelled us to remove the check has since
been released.

Link: https://lore.kernel.org/r/20240214221411.2888112-1-martin.petersen@oracle.com
Fixes: c92a6b5 ("scsi: core: Query VPD size before getting full page")
	Cc: stable@vger.kernel.org
	Cc: Bart Van Assche <bvanassche@acm.org>
	Reported-by: Julian Sikorski <belegdol@gmail.com>
	Tested-by: Julian Sikorski <belegdol@gmail.com>
	Reviewed-by: Lee Duncan <lee.duncan@suse.com>
	Reviewed-by: Bart Van Assche <bvanassche@acm.org>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit b5fc07a)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira LE-2349
Rebuild_History Non-Buildable kernel-4.18.0-553.37.1.el8_10
commit-author Guilherme G. Piccoli <gpiccoli@igalia.com>
commit f23a4d6

Commit fc66371 ("scsi: core: Remove the /proc/scsi/${proc_name}
directory earlier") fixed a bug related to modules loading/unloading, by
adding a call to scsi_proc_hostdir_rm() on scsi_remove_host(). But that led
to a potential duplicate call to the hostdir_rm() routine, since it's also
called from scsi_host_dev_release(). That triggered a regression report,
which was then fixed by commit be03df3 ("scsi: core: Fix a procfs host
directory removal regression"). The fix just dropped the hostdir_rm() call
from dev_release().

But it happens that this proc directory is created on scsi_host_alloc(),
and that function "pairs" with scsi_host_dev_release(), while
scsi_remove_host() pairs with scsi_add_host(). In other words, it seems the
reason for removing the proc directory on dev_release() was meant to cover
cases in which a SCSI host structure was allocated, but the call to
scsi_add_host() didn't happen. And that pattern happens to exist in some
error paths, for example.

Syzkaller causes that by using USB raw gadget device, error'ing on
usb-storage driver, at usb_stor_probe2(). By checking that path, we can see
that the BadDevice label leads to a scsi_host_put() after a SCSI host
allocation, but there's no call to scsi_add_host() in such path. That leads
to messages like this in dmesg (and a leak of the SCSI host proc
structure):

usb-storage 4-1:87.51: USB Mass Storage device detected
proc_dir_entry 'scsi/usb-storage' already registered
WARNING: CPU: 1 PID: 3519 at fs/proc/generic.c:377 proc_register+0x347/0x4e0 fs/proc/generic.c:376

The proper fix seems to still call scsi_proc_hostdir_rm() on dev_release(),
but guard that with the state check for SHOST_CREATED; there is even a
comment in scsi_host_dev_release() detailing that: such conditional is
meant for cases where the SCSI host was allocated but there was no calls to
{add,remove}_host(), like the usb-storage case.

This is what we propose here and with that, the error path of usb-storage
does not trigger the warning anymore.

	Reported-by: syzbot+c645abf505ed21f931b5@syzkaller.appspotmail.com
Fixes: be03df3 ("scsi: core: Fix a procfs host directory removal regression")
	Cc: stable@vger.kernel.org
	Cc: Bart Van Assche <bvanassche@acm.org>
	Cc: John Garry <john.g.garry@oracle.com>
	Cc: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
	Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Link: https://lore.kernel.org/r/20240313113006.2834799-1-gpiccoli@igalia.com
	Reviewed-by: Bart Van Assche <bvanassche@acm.org>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit f23a4d6)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…ount

jira LE-2349
Rebuild_History Non-Buildable kernel-4.18.0-553.37.1.el8_10
commit-author Martin K. Petersen <martin.petersen@oracle.com>
commit d09c05a

Peter Schneider reported that a system would no longer boot after
updating to 6.8.4.  Peter bisected the issue and identified commit
b5fc07a ("scsi: core: Consult supported VPD page list prior to
fetching page") as being the culprit.

Turns out the enclosure device in Peter's system reports a byteswapped
page length for VPD page 0. It reports "02 00" as page length instead
of "00 02". This causes us to attempt to access 516 bytes (page length
+ header) of information despite only 2 pages being present.

Limit the page search scope to the size of our VPD buffer to guard
against devices returning a larger page count than requested.

Link: https://lore.kernel.org/r/20240521023040.2703884-1-martin.petersen@oracle.com
Fixes: b5fc07a ("scsi: core: Consult supported VPD page list prior to fetching page")
	Cc: stable@vger.kernel.org
	Reported-by: Peter Schneider <pschneider1968@googlemail.com>
Closes: https://lore.kernel.org/all/eec6ebbf-061b-4a7b-96dc-ea748aa4d035@googlemail.com/
	Tested-by: Peter Schneider <pschneider1968@googlemail.com>
	Reviewed-by: Bart Van Assche <bvanassche@acm.org>
	Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
(cherry picked from commit d09c05a)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira LE-2349
cve CVE-2024-50275
Rebuild_History Non-Buildable kernel-4.18.0-553.37.1.el8_10
commit-author Mark Brown <broonie@kernel.org>
commit 751ecf6
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.37.1.el8_10/751ecf6a.failed

The logic for handling SVE traps manipulates saved FPSIMD/SVE state
incorrectly, and a race with preemption can result in a task having
TIF_SVE set and TIF_FOREIGN_FPSTATE clear even though the live CPU state
is stale (e.g. with SVE traps enabled). This has been observed to result
in warnings from do_sve_acc() where SVE traps are not expected while
TIF_SVE is set:

|         if (test_and_set_thread_flag(TIF_SVE))
|                 WARN_ON(1); /* SVE access shouldn't have trapped */

Warnings of this form have been reported intermittently, e.g.

  https://lore.kernel.org/linux-arm-kernel/CA+G9fYtEGe_DhY2Ms7+L7NKsLYUomGsgqpdBj+QwDLeSg=JhGg@mail.gmail.com/
  https://lore.kernel.org/linux-arm-kernel/000000000000511e9a060ce5a45c@google.com/

The race can occur when the SVE trap handler is preempted before and
after manipulating the saved FPSIMD/SVE state, starting and ending on
the same CPU, e.g.

| void do_sve_acc(unsigned long esr, struct pt_regs *regs)
| {
|         // Trap on CPU 0 with TIF_SVE clear, SVE traps enabled
|         // task->fpsimd_cpu is 0.
|         // per_cpu_ptr(&fpsimd_last_state, 0) is task.
|
|         ...
|
|         // Preempted; migrated from CPU 0 to CPU 1.
|         // TIF_FOREIGN_FPSTATE is set.
|
|         get_cpu_fpsimd_context();
|
|         if (test_and_set_thread_flag(TIF_SVE))
|                 WARN_ON(1); /* SVE access shouldn't have trapped */
|
|         sve_init_regs() {
|                 if (!test_thread_flag(TIF_FOREIGN_FPSTATE)) {
|                         ...
|                 } else {
|                         fpsimd_to_sve(current);
|                         current->thread.fp_type = FP_STATE_SVE;
|                 }
|         }
|
|         put_cpu_fpsimd_context();
|
|         // Preempted; migrated from CPU 1 to CPU 0.
|         // task->fpsimd_cpu is still 0
|         // If per_cpu_ptr(&fpsimd_last_state, 0) is still task then:
|         // - Stale HW state is reused (with SVE traps enabled)
|         // - TIF_FOREIGN_FPSTATE is cleared
|         // - A return to userspace skips HW state restore
| }

Fix the case where the state is not live and TIF_FOREIGN_FPSTATE is set
by calling fpsimd_flush_task_state() to detach from the saved CPU
state. This ensures that a subsequent context switch will not reuse the
stale CPU state, and will instead set TIF_FOREIGN_FPSTATE, forcing the
new state to be reloaded from memory prior to a return to userspace.

Fixes: cccb78c ("arm64/sve: Rework SVE access trap to convert state in registers")
	Reported-by: Mark Rutland <mark.rutland@arm.com>
	Signed-off-by: Mark Brown <broonie@kernel.org>
	Cc: stable@vger.kernel.org
	Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Link: https://lore.kernel.org/r/20241030-arm64-fpsimd-foreign-flush-v1-1-bd7bd66905a2@kernel.org
	Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 751ecf6)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	arch/arm64/kernel/fpsimd.c
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v4.18~1..master: 524209
Number of commits in rpm: 11
Number of commits matched with upstream: 5 (45.45%)
Number of commits in upstream but not in rpm: 524204
Number of commits NOT found in upstream: 6 (54.55%)

Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.37.1.el8_10 for kernel-4.18.0-553.37.1.el8_10
Clean Cherry Picks: 4 (80.00%)
Empty Cherry Picks: 1 (20.00%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-4.18.0-553.37.1.el8_10/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures contained in the same containing directory.
The git message for empty commits will have the path for the failed commit.
File names are the first 8 characters of the upstream SHA
jira LE-2349
Rebuild_History Non-Buildable kernel-4.18.0-553.40.1.el8_10
commit-author Niklas Schnelle <schnelle@linux.ibm.com>
commit 3cd03ea

The Linux implementation of PCI error recovery for s390 was based on the
understanding that firmware error recovery is a two step process with an
optional initial error event to indicate the cause of the error if known
followed by either error event 0x3A (Success) or 0x3B (Failure) to
indicate whether firmware was able to recover. While this has been the
case in testing and the error cases seen in the wild it turns out this
is not correct. Instead firmware only generates 0x3A for some error and
service scenarios and expects the OS to perform recovery for all PCI
events codes except for those indicating permanent error (0x3B, 0x40)
and those indicating errors on the function measurement block (0x2A,
0x2B, 0x2C). Align Linux behavior with these expectations.

Fixes: 4cdf2f4 ("s390/pci: implement minimal PCI error recovery")
	Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
	Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
	Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
(cherry picked from commit 3cd03ea)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira LE-2349
Rebuild_History Non-Buildable kernel-4.18.0-553.40.1.el8_10
commit-author Sidraya Jayagond <sidraya@linux.ibm.com>
commit ebaf813

Passing MSG_PEEK flag to skb_recv_datagram() increments skb refcount
(skb->users) and iucv_sock_recvmsg() does not decrement skb refcount
at exit.
This results in skb memory leak in skb_queue_purge() and WARN_ON in
iucv_sock_destruct() during socket close. To fix this decrease
skb refcount by one if MSG_PEEK is set in order to prevent memory
leak and WARN_ON.

WARNING: CPU: 2 PID: 6292 at net/iucv/af_iucv.c:286 iucv_sock_destruct+0x144/0x1a0 [af_iucv]
CPU: 2 PID: 6292 Comm: afiucv_test_msg Kdump: loaded Tainted: G        W          6.10.0-rc7 #1
Hardware name: IBM 3931 A01 704 (z/VM 7.3.0)
Call Trace:
        [<001587c682c4aa98>] iucv_sock_destruct+0x148/0x1a0 [af_iucv]
        [<001587c682c4a9d0>] iucv_sock_destruct+0x80/0x1a0 [af_iucv]
        [<001587c704117a32>] __sk_destruct+0x52/0x550
        [<001587c704104a54>] __sock_release+0xa4/0x230
        [<001587c704104c0c>] sock_close+0x2c/0x40
        [<001587c702c5f5a8>] __fput+0x2e8/0x970
        [<001587c7024148c4>] task_work_run+0x1c4/0x2c0
        [<001587c7023b0716>] do_exit+0x996/0x1050
        [<001587c7023b13aa>] do_group_exit+0x13a/0x360
        [<001587c7023b1626>] __s390x_sys_exit_group+0x56/0x60
        [<001587c7022bccca>] do_syscall+0x27a/0x380
        [<001587c7049a6a0c>] __do_syscall+0x9c/0x160
        [<001587c7049ce8a8>] system_call+0x70/0x98
        Last Breaking-Event-Address:
        [<001587c682c4a9d4>] iucv_sock_destruct+0x84/0x1a0 [af_iucv]

Fixes: eac3731 ("[S390]: Add AF_IUCV socket support")
	Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
	Reviewed-by: Thorsten Winkler <twinkler@linux.ibm.com>
	Signed-off-by: Sidraya Jayagond <sidraya@linux.ibm.com>
	Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
	Reviewed-by: David Wei <dw@davidwei.uk>
Link: https://patch.msgid.link/20241119152219.3712168-1-wintera@linux.ibm.com
	Signed-off-by: Paolo Abeni <pabeni@redhat.com>

(cherry picked from commit ebaf813)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira LE-2349
Rebuild_History Non-Buildable kernel-4.18.0-553.40.1.el8_10
commit-author Niklas Schnelle <schnelle@linux.ibm.com>
commit 0467cdd
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.40.1.el8_10/0467cdde.failed

Instead of relying on the observed but not architected firmware behavior
that PCI functions from the same card are listed in ascending RID order
in clp_list_pci() ensure this by sorting. To allow for sorting separate
the initial clp_list_pci() and creation of the virtual PCI busses.

Note that fundamentally in our per-PCI function hotplug design non RID
order of discovery is still possible. For example when the two PFs of
a two port NIC are hotplugged after initial boot and in descending RID
order. In this case the virtual PCI bus would be created by the second
PF using that PF's UID as domain number instead of that of the first PF.
Thus the domain number would then change from the UID of the second PF
to that of the first PF on reboot but there is really nothing we can do
about that since changing domain numbers at runtime seems even worse.
This only impacts the domain number as the RIDs are consistent and thus
even with just the second PF visible it will show up in the correct
position on the virtual bus.

	Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
	Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
	Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
(cherry picked from commit 0467cdd)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	arch/s390/pci/pci.c
jira LE-2349
Rebuild_History Non-Buildable kernel-4.18.0-553.40.1.el8_10
commit-author Niklas Schnelle <schnelle@linux.ibm.com>
commit 126034f
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.40.1.el8_10/126034fa.failed

The newly introduced topology ID (TID) field in the CLP Query PCI
Function explicitly identifies groups of PCI functions whose RIDs belong
to the same (sub-)topology. When available use the TID instead of the
PCHID to match zPCI busses/domains for multi-function devices. Note that
currently only a single PCI bus per TID is supported. This change is
required because in future machines the PCHID will not identify a PCI
card but a specific port in the case of some multi-port NICs while from
a PCI point of view the entire card is a subtopology.

	Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
	Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
	Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
(cherry picked from commit 126034f)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	arch/s390/include/asm/pci.h
jira LE-2349
Rebuild_History Non-Buildable kernel-4.18.0-553.40.1.el8_10
commit-author Niklas Schnelle <schnelle@linux.ibm.com>
commit 25f39d3
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.40.1.el8_10/25f39d3d.failed

Ensure that VFs used in isolation, that is with their parent PF not
visible to the configuration but with their RID exposed, are treated
compatibly with existing isolated VF use cases without exposed RID
including RoCE Express VFs. This allows creating configurations where
one LPAR manages PFs while their child VFs are used by other LPARs. This
gives the LPAR managing the PFs a role analogous to that of the
hypervisor in a typical use case of passing child VFs to guests.

Instead of creating a multifunction struct zpci_bus whenever a PCI
function with RID exposed is discovered only create such a bus for
configured physical functions and only consider multifunction busses
when searching for an existing bus. Additionally only set zdev->devfn to
the devfn part of the RID once the function is added to a multifunction
bus.

This also fixes probing of more than 7 such isolated VFs from the same
physical bus. This is because common PCI code in pci_scan_slot() only
looks for more functions when pdev->multifunction is set which somewhat
counter intutively is not the case for VFs.

Note that PFs are looked at before their child VFs is guaranteed because
we sort the zpci_list by RID ascending.

	Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
	Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
	Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
(cherry picked from commit 25f39d3)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	arch/s390/pci/pci_bus.c
#	arch/s390/pci/pci_clp.c
jira LE-2349
Rebuild_History Non-Buildable kernel-4.18.0-553.40.1.el8_10
commit-author Niklas Schnelle <schnelle@linux.ibm.com>
commit 4879610
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.40.1.el8_10/48796104.failed

Prior to commit 0467cdd ("s390/pci: Sort PCI functions prior to
creating virtual busses") the IOMMU was initialized and the device was
registered as part of zpci_create_device() with the struct zpci_dev
freed if either resulted in an error. With that commit this was moved
into a separate function called zpci_add_device().

While this new function logs when adding failed, it expects the caller
not to use and to free the struct zpci_dev on error. This difference
between it and zpci_create_device() was missed while changing the
callers and the incompletely initialized struct zpci_dev may get used in
zpci_scan_configured_device in the error path. This then leads to
a crash due to the device not being registered with the zbus. It was
also not freed in this case. Fix this by handling the error return of
zpci_add_device(). Since in this case the zdev was not added to the
zpci_list it can simply be discarded and freed. Also make this more
explicit by moving the kref_init() into zpci_add_device() and document
that zpci_zdev_get()/zpci_zdev_put() must be used after adding.

	Cc: stable@vger.kernel.org
Fixes: 0467cdd ("s390/pci: Sort PCI functions prior to creating virtual busses")
	Reviewed-by: Gerd Bayer <gbayer@linux.ibm.com>
	Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com>
	Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
	Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
(cherry picked from commit 4879610)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	arch/s390/pci/pci.c
#	arch/s390/pci/pci_event.c
jira LE-2349
Rebuild_History Non-Buildable kernel-4.18.0-553.40.1.el8_10
commit-author Gerd Bayer <gbayer@linux.ibm.com>
commit 5fd11b9
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.40.1.el8_10/5fd11b96.failed

Factor out adapter interrupt allocation from arch_setup_msi_irqs() in
preparation for enabling registration of multiple MSIs. Code movement
only, no change of functionality intended.

	Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
	Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
	Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
(cherry picked from commit 5fd11b9)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	arch/s390/pci/pci_irq.c
jira LE-2349
Rebuild_History Non-Buildable kernel-4.18.0-553.40.1.el8_10
commit-author Gerd Bayer <gbayer@linux.ibm.com>
commit ab42fcb
Empty-Commit: Cherry-Pick Conflicts during history rebuild.
Will be included in final tarball splat. Ref for failed cherry-pick at:
ciq/ciq_backports/kernel-4.18.0-553.40.1.el8_10/ab42fcb5.failed

On a PCI adapter that provides up to 8 MSI interrupt sources the s390
implementation of PCI interrupts rejected to accommodate them, although
the underlying hardware is able to support that.

For MSI-X it is sufficient to allocate a single irq_desc per msi_desc,
but for MSI multiple irq descriptors are attached to and controlled by
a single msi descriptor. Add the appropriate loops to maintain multiple
irq descriptors and tie/untie them to/from the appropriate AIBV bit, if
a device driver allocates more than 1 MSI interrupt.

Common PCI code passes on requests to allocate a number of interrupt
vectors based on the device drivers' demand and the PCI functions'
capabilities. However, the root-complex of s390 systems support just a
limited number of interrupt vectors per PCI function.
Produce a kernel log message to inform about any architecture-specific
capping that might be done.

With this change, we had a PCI adapter successfully raising
interrupts to its device driver via all 8 sources.

Fixes: a384c89 ("s390/PCI: Fix single MSI only check")
	Signed-off-by: Gerd Bayer <gbayer@linux.ibm.com>
	Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
	Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
(cherry picked from commit ab42fcb)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>

# Conflicts:
#	arch/s390/pci/pci_irq.c
jira LE-2349
Rebuild_History Non-Buildable kernel-4.18.0-553.40.1.el8_10
commit-author Greg Jesionowski <jesionowskigreg@gmail.com>
commit ef8a0f6

This adds the vendor and product IDs for the AT29M2-AF which is a
lan7801-based device.

	Signed-off-by: Greg Jesionowski <jesionowskigreg@gmail.com>
Link: https://lore.kernel.org/r/20211214221027.305784-1-jesionowskigreg@gmail.com
	Signed-off-by: Jakub Kicinski <kuba@kernel.org>
(cherry picked from commit ef8a0f6)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
jira LE-2349
Rebuild_History Non-Buildable kernel-4.18.0-553.40.1.el8_10
commit-author Andreas Gruenbacher <agruenba@redhat.com>
commit 7c9d922

Truncate an inode's address space when flipping the GFS2_DIF_JDATA flag:
depending on that flag, the pages in the address space will either use
buffer heads or iomap_folio_state structs, and we cannot mix the two.

	Reported-by: Kun Hu <huk23@m.fudan.edu.cn>, Jiaji Qin <jjtan24@m.fudan.edu.cn>
	Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
(cherry picked from commit 7c9d922)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
…parse_format

jira LE-2349
cve CVE-2024-53104
Rebuild_History Non-Buildable kernel-4.18.0-553.40.1.el8_10
commit-author Benoit Sevens <bsevens@google.com>
commit ecf2b43

This can lead to out of bounds writes since frames of this type were not
taken into account when calculating the size of the frames buffer in
uvc_parse_streaming.

Fixes: c0efd23 ("V4L/DVB (8145a): USB Video Class driver")
	Signed-off-by: Benoit Sevens <bsevens@google.com>
	Cc: stable@vger.kernel.org
	Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
	Reviewed-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
	Signed-off-by: Hans Verkuil <hverkuil@xs4all.nl>
(cherry picked from commit ecf2b43)
	Signed-off-by: Jonathan Maple <jmaple@ciq.com>
Rebuild_History BUILDABLE
Rebuilding Kernel from rpm changelog with Fuzz Limit: 87.50%
Number of commits in upstream range v4.18~1..master: 524209
Number of commits in rpm: 17
Number of commits matched with upstream: 11 (64.71%)
Number of commits in upstream but not in rpm: 524198
Number of commits NOT found in upstream: 6 (35.29%)

Rebuilding Kernel on Branch rocky8_10_rebuild_kernel-4.18.0-553.40.1.el8_10 for kernel-4.18.0-553.40.1.el8_10
Clean Cherry Picks: 5 (45.45%)
Empty Cherry Picks: 6 (54.55%)
_______________________________

Full Details Located here:
ciq/ciq_backports/kernel-4.18.0-553.40.1.el8_10/rebuild.details.txt

Includes:
* git commit header above
* Empty Commits with upstream SHA
* RPM ChangeLog Entries that could not be matched

Individual Empty Commit failures contained in the same containing directory.
The git message for empty commits will have the path for the failed commit.
File names are the first 8 characters of the upstream SHA

@bmastbergen bmastbergen left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🥌

@gvrose8192 gvrose8192 left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - Thanks!

@PlaidCat PlaidCat merged commit 0dbf877 into rocky8_10 Mar 5, 2025
@PlaidCat PlaidCat deleted the rocky8_10_rebuild branch March 5, 2025 00:32
ciq-kernel-automation Bot pushed a commit that referenced this pull request May 11, 2026
jira VULN-167075
cve CVE-2024-53216
commit-author Yang Erkun <yangerkun@huawei.com>
commit f8c989a

The last reference for `cache_head` can be reduced to zero in `c_show`
and `e_show`(using `rcu_read_lock` and `rcu_read_unlock`). Consequently,
`svc_export_put` and `expkey_put` will be invoked, leading to two
issues:

1. The `svc_export_put` will directly free ex_uuid. However,
   `e_show`/`c_show` will access `ex_uuid` after `cache_put`, which can
   trigger a use-after-free issue, shown below.

   ==================================================================
   BUG: KASAN: slab-use-after-free in svc_export_show+0x362/0x430 [nfsd]
   Read of size 1 at addr ff11000010fdc120 by task cat/870

   CPU: 1 UID: 0 PID: 870 Comm: cat Not tainted 6.12.0-rc3+ #1
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
   1.16.1-2.fc37 04/01/2014
   Call Trace:
    <TASK>
    dump_stack_lvl+0x53/0x70
    print_address_description.constprop.0+0x2c/0x3a0
    print_report+0xb9/0x280
    kasan_report+0xae/0xe0
    svc_export_show+0x362/0x430 [nfsd]
    c_show+0x161/0x390 [sunrpc]
    seq_read_iter+0x589/0x770
    seq_read+0x1e5/0x270
    proc_reg_read+0xe1/0x140
    vfs_read+0x125/0x530
    ksys_read+0xc1/0x160
    do_syscall_64+0x5f/0x170
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

   Allocated by task 830:
    kasan_save_stack+0x20/0x40
    kasan_save_track+0x14/0x30
    __kasan_kmalloc+0x8f/0xa0
    __kmalloc_node_track_caller_noprof+0x1bc/0x400
    kmemdup_noprof+0x22/0x50
    svc_export_parse+0x8a9/0xb80 [nfsd]
    cache_do_downcall+0x71/0xa0 [sunrpc]
    cache_write_procfs+0x8e/0xd0 [sunrpc]
    proc_reg_write+0xe1/0x140
    vfs_write+0x1a5/0x6d0
    ksys_write+0xc1/0x160
    do_syscall_64+0x5f/0x170
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

   Freed by task 868:
    kasan_save_stack+0x20/0x40
    kasan_save_track+0x14/0x30
    kasan_save_free_info+0x3b/0x60
    __kasan_slab_free+0x37/0x50
    kfree+0xf3/0x3e0
    svc_export_put+0x87/0xb0 [nfsd]
    cache_purge+0x17f/0x1f0 [sunrpc]
    nfsd_destroy_serv+0x226/0x2d0 [nfsd]
    nfsd_svc+0x125/0x1e0 [nfsd]
    write_threads+0x16a/0x2a0 [nfsd]
    nfsctl_transaction_write+0x74/0xa0 [nfsd]
    vfs_write+0x1a5/0x6d0
    ksys_write+0xc1/0x160
    do_syscall_64+0x5f/0x170
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

2. We cannot sleep while using `rcu_read_lock`/`rcu_read_unlock`.
   However, `svc_export_put`/`expkey_put` will call path_put, which
   subsequently triggers a sleeping operation due to the following
   `dput`.

   =============================
   WARNING: suspicious RCU usage
   5.10.0-dirty #141 Not tainted
   -----------------------------
   ...
   Call Trace:
   dump_stack+0x9a/0xd0
   ___might_sleep+0x231/0x240
   dput+0x39/0x600
   path_put+0x1b/0x30
   svc_export_put+0x17/0x80
   e_show+0x1c9/0x200
   seq_read_iter+0x63f/0x7c0
   seq_read+0x226/0x2d0
   vfs_read+0x113/0x2c0
   ksys_read+0xc9/0x170
   do_syscall_64+0x33/0x40
   entry_SYSCALL_64_after_hwframe+0x67/0xd1

Fix these issues by using `rcu_work` to help release
`svc_expkey`/`svc_export`. This approach allows for an asynchronous
context to invoke `path_put` and also facilitates the freeing of
`uuid/exp/key` after an RCU grace period.

Fixes: 9ceddd9 ("knfsd: Allow lockless lookups of the exports")
	Signed-off-by: Yang Erkun <yangerkun@huawei.com>
	Reviewed-by: Jeff Layton <jlayton@kernel.org>
	Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
(cherry picked from commit f8c989a)
	Signed-off-by: CIQ Kernel Automation <ciq_kernel_automation@ciq.com>
roxanan1996 added a commit that referenced this pull request Jun 2, 2026
cve CVE-2024-53216
commit-author Yang Erkun <yangerkun@huawei.com>
commit f8c989a

The last reference for `cache_head` can be reduced to zero in `c_show`
and `e_show`(using `rcu_read_lock` and `rcu_read_unlock`). Consequently,
`svc_export_put` and `expkey_put` will be invoked, leading to two
issues:

1. The `svc_export_put` will directly free ex_uuid. However,
   `e_show`/`c_show` will access `ex_uuid` after `cache_put`, which can
   trigger a use-after-free issue, shown below.

   ==================================================================
   BUG: KASAN: slab-use-after-free in svc_export_show+0x362/0x430 [nfsd]
   Read of size 1 at addr ff11000010fdc120 by task cat/870

   CPU: 1 UID: 0 PID: 870 Comm: cat Not tainted 6.12.0-rc3+ #1
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
   1.16.1-2.fc37 04/01/2014
   Call Trace:
    <TASK>
    dump_stack_lvl+0x53/0x70
    print_address_description.constprop.0+0x2c/0x3a0
    print_report+0xb9/0x280
    kasan_report+0xae/0xe0
    svc_export_show+0x362/0x430 [nfsd]
    c_show+0x161/0x390 [sunrpc]
    seq_read_iter+0x589/0x770
    seq_read+0x1e5/0x270
    proc_reg_read+0xe1/0x140
    vfs_read+0x125/0x530
    ksys_read+0xc1/0x160
    do_syscall_64+0x5f/0x170
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

   Allocated by task 830:
    kasan_save_stack+0x20/0x40
    kasan_save_track+0x14/0x30
    __kasan_kmalloc+0x8f/0xa0
    __kmalloc_node_track_caller_noprof+0x1bc/0x400
    kmemdup_noprof+0x22/0x50
    svc_export_parse+0x8a9/0xb80 [nfsd]
    cache_do_downcall+0x71/0xa0 [sunrpc]
    cache_write_procfs+0x8e/0xd0 [sunrpc]
    proc_reg_write+0xe1/0x140
    vfs_write+0x1a5/0x6d0
    ksys_write+0xc1/0x160
    do_syscall_64+0x5f/0x170
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

   Freed by task 868:
    kasan_save_stack+0x20/0x40
    kasan_save_track+0x14/0x30
    kasan_save_free_info+0x3b/0x60
    __kasan_slab_free+0x37/0x50
    kfree+0xf3/0x3e0
    svc_export_put+0x87/0xb0 [nfsd]
    cache_purge+0x17f/0x1f0 [sunrpc]
    nfsd_destroy_serv+0x226/0x2d0 [nfsd]
    nfsd_svc+0x125/0x1e0 [nfsd]
    write_threads+0x16a/0x2a0 [nfsd]
    nfsctl_transaction_write+0x74/0xa0 [nfsd]
    vfs_write+0x1a5/0x6d0
    ksys_write+0xc1/0x160
    do_syscall_64+0x5f/0x170
    entry_SYSCALL_64_after_hwframe+0x76/0x7e

2. We cannot sleep while using `rcu_read_lock`/`rcu_read_unlock`.
   However, `svc_export_put`/`expkey_put` will call path_put, which
   subsequently triggers a sleeping operation due to the following
   `dput`.

   =============================
   WARNING: suspicious RCU usage
   5.10.0-dirty #141 Not tainted
   -----------------------------
   ...
   Call Trace:
   dump_stack+0x9a/0xd0
   ___might_sleep+0x231/0x240
   dput+0x39/0x600
   path_put+0x1b/0x30
   svc_export_put+0x17/0x80
   e_show+0x1c9/0x200
   seq_read_iter+0x63f/0x7c0
   seq_read+0x226/0x2d0
   vfs_read+0x113/0x2c0
   ksys_read+0xc9/0x170
   do_syscall_64+0x33/0x40
   entry_SYSCALL_64_after_hwframe+0x67/0xd1

Fix these issues by using `rcu_work` to help release
`svc_expkey`/`svc_export`. This approach allows for an asynchronous
context to invoke `path_put` and also facilitates the freeing of
`uuid/exp/key` after an RCU grace period.

Fixes: 9ceddd9 ("knfsd: Allow lockless lookups of the exports")
	Signed-off-by: Yang Erkun <yangerkun@huawei.com>
	Reviewed-by: Jeff Layton <jlayton@kernel.org>
	Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
(cherry picked from commit f8c989a)
	Signed-off-by: Roxana Nicolescu <rnicolescu@ciq.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

4 participants